On the Insensitivity of User Distribution in Multicell 
Networks under General Mobility and Session Patterns 

Wei Bao and Ben Liang 
Department of Electrical and Computer Engineering, University of Toronto, Canada 
Email: {wbao, liang}@comm.utoronto.ca 






Abstract — The location of active users is an important factor in 
the performance analysis of mobile multicell networks, but it is 
difficult to quantify due to the wide variety of user mobility and 
session patterns. In particular, the channel holding times in each 
cell may be arbitrarily distributed and dependent on those in 
other cells. In this work, we study the stationary distribution 
of users by modeling the system as a multi-route queueing 
network with Poisson inputs. We consider arbitrary routing 
and arbitrary joint probability distributions for the channel 
holding times in each route. Using a decomposition-composition 
approach, we show that the user distribution (1) is insensitive 
to the mobility and session patterns, (2) depends only on the 
average arrival rate and average channel holding time at each 
cell, and (3) is completely characterized by an open network with 
M/M/∞ queues. This result is validated by experiments with
the Dartmouth user mobility traces. 



I. Introduction 

In designing ever more efficient and capable mobile access 
networks, the accurate modeling of how user mobility and 
session connectivity patterns affect network performance is of 
paramount interest. However, compared with wired networks, 
the analytical modeling of mobile networks is burdened with 
many additional technical challenges. Some of the most diffi- 
cult factors are the following: 

• The movement of users may be individually arbitrary,
without following any common mobility pattern [1].

• The session durations may have a general probability
distribution, supporting diverse data and multimedia
applications [2].

• The channel holding times at different cells are correlated,
dependent on the speed or trajectory of different users [3].

To facilitate tractable analysis, existing studies often adopt 
simplified models. For example, classical models assume ex- 
ponentially distributed cell dwell times and session durations, 
resulting in independent and memoryless channel holding 
times [4]-[6]. Although more general mobility and session
models have been considered in the past literature [2], [3],
[7]-[10], to the best of our knowledge, none addresses all of
the challenges above. 

In this paper, we study the distribution of active users in a 
multicell network, which is important for network
management and planning. Prior studies have proposed several
analytical models to estimate the user distribution with various
degrees of detail and generality [11]-[15]. In contrast, we consider
general mobility and session patterns, only requiring that the
new session arrivals form a Poisson process, which is well
supported by experimental data [15]-[17]. We model the user



mobility with a general system with multiple routes, each 
representing one type of users with a specific movement 
pattern. A general probability distribution is used to represent 
the session durations. As a consequence, the channel holding 
times at different cell sites are no longer independent. 

Through a decomposition-composition approach, we derive 
a closed-form expression for the joint stationary distribution 
for the number of users in all cells. A central observation is
that this expression depends only on the average arrival
rate and average channel holding time at each cell. This leads
to four important conclusions on the user distribution: first,
it is insensitive to how the users move through the system;
second, it is insensitive to the general distribution of session
durations; third, it is insensitive to the correlation among the
channel holding times; and fourth, it is perfectly captured by
an open Jackson network with M/M/∞ queues. This result
confirms and provides analytical support to the experimental
observations of [15]. We conduct further experimental validation
using the Dartmouth traces [18] with 152 APs and
more than 5000 users, which shows that the analysis accurately
predicts real-world user distributions.

The conclusion of this work has important consequences for
performance analysis and practical system design. It suggests
that accurate calculation of the user distribution, and other as- 
sociated metrics such as the system workload, can be achieved 
with much lower requirement for system parameter estimation 
than previously expected. Furthermore, the simplicity of the 
resultant product-form user distribution enables further analyt- 
ical endeavors in system optimization. 

The rest of the paper is organized as follows. In Section II,
we discuss the relation between our work and prior works. In
Section III, we present the system model. In Sections IV and
V, we derive the analytical stationary distributions for single-route
and multiple-route networks, respectively. In Section VI,
we validate our analysis with experimental results from the
Dartmouth traces. Finally, concluding remarks are given in
Section VII.



II. Related Work 

A. User Distribution in Mobile Networks 

The user distribution is an important factor in the man- 
agement and planning of mobile networks. However, rela- 
tively few analytical models are available in the literature. 
The uniform user distribution has been widely adopted for 
mathematical convenience, but it does not account for non-homogeneity
in the physical topology and is incorrect in some
cases. For example, it is well known that the user distribution
is non-uniform under the random waypoint model [11].

Other previous works have proposed analytical models
using stochastic queueing networks to derive the user distribution
in different environments, including wireless multimedia
networks [12], vehicular ad-hoc networks [13], and WLANs
[14], [15]. However, they do not allow arbitrary mobility or
arbitrary session patterns. In terms of user movement, [12],
[13], and [15] assume that users move from one cell to another
probabilistically and memorylessly, while [14] focuses on
scattered single cells, so that user movement among multiple
cells is not discussed. In terms of channel holding times, [12]
uses the sum of hyper-exponentials or the Coxian distribution
to approximate arbitrary distributions; [14] assumes generally
distributed channel holding times but concerns only a single
cell; and [13] and [15] model the system as an open network
with M/G/∞ queues. None of them considers the dependence
among channel holding times.

Although [15] uses simpler mobility and session models,
the authors have observed a surprising match between analysis
and experimental data from the Dartmouth traces. Our work
provides an analytical explanation for this, since we show that
an even simpler open Jackson network with M/M/∞ queues
suffices to completely characterize the user distribution under
general mobility and session patterns.

B. Spatial Poisson Process 

An open M/M/∞ Jackson network implies that the number
of users in each cell is independently Poisson. This spatial
non-homogeneous Poisson model is commonly used in the
geometric analysis of interference in wireless networks [19],
[20]. It can be inferred from associating the user trajectory
as location-dependent marks to a space-time Poisson process
representing the entry location and time of the users [21]. In
this work, we arrive at the same conclusion using a different 
approach, which additionally shows that the mean values for 
the Poisson distributions in different cells are insensitive to 
the arbitrary and dependent channel holding times. This en- 
ables simple yet accurate computation of system performance 
measures. 

C. Insensitivity Property 

Insensitivity of a queueing network refers to the situation
where the stationary distribution remains unchanged
while the distribution of service times takes arbitrary forms.
When the service times are assumed independent among
different queues, there are several well known conditions for
insensitivity. For example, networks with symmetric queues
are insensitive [22]. In [23] and [24], the partial balance of
probability flows is shown to be a sufficient condition for
insensitivity. In [25], partial reversibility is shown to be a
necessary and sufficient condition for insensitivity. However,
none of these known results considers the case where the
service times at different queues are dependent. For
example, the queueing network most closely related to ours is
one with M/G/∞ queues. It is known to be insensitive when
the service times are independent [22], but to the best of our



knowledge, there is no further general result for dependent 
service times. 

III. System Model 

Consider a cellular network with $C$ cells. There are $L$
independent routes, each defined as a finite ordered sequence
of cells. The $j$th stage on the $l$th route corresponds to the $j$th
cell in the sequence, which is denoted as $c(l,j)$. Let $N_l$ be
the number of stages on the $l$th route. Each user of the $l$th
route starts a new session in cell $c(l,1)$; then it moves along
the route through cells $c(l,1), c(l,2), \ldots, c(l,N_l)$, as long as the
session remains active. The user is considered to have departed
the network when its session terminates or when it exits cell
$c(l,N_l)$. We allow an arbitrary number of arbitrary routes to
cover all possible movement patterns.

For each route, we assume the arrivals of new sessions to 
form a Poisson process. Note that although the arrivals of 
packets in the Internet may not form Poisson processes [26],
the arrivals of new sessions are at a much larger time scale
and are well justified as Poisson [16], [17]. Furthermore, in
[15] and later in Section VI, experimental data show that new
sessions in the type of mobile networks under consideration
are indeed Poisson barring some extreme cases. We emphasize
that only the new session arrivals are Poisson, while the hand- 
off arrivals at each cell have general statistics with complicated 
dependencies. 

The session duration of a user on the $l$th route is modeled
as an arbitrarily distributed random variable $T_l$. Let $\lambda_{l0}$ be the
new session arrival rate at the $l$th route. After a new session
arrival, let $\tau_{l1}$ denote the residual cell dwell time of the user in
the 1st stage on the $l$th route, which is arbitrarily distributed.
Let $\tau_{lj}$, $2 \le j \le N_l$, denote the cell dwell time of the user
in the $j$th stage on the $l$th route, which is also arbitrarily
distributed. Then, the channel holding time of the $j$th stage
on the $l$th route, $t_{lj}$, if it exists, can be represented as follows:

$$t_{l1} = \min\{T_l, \tau_{l1}\},$$
$$t_{lj} = \min\Big\{T_l - \sum_{i=1}^{j-1} \tau_{li},\ \tau_{lj}\Big\}, \quad \text{if } T_l > \sum_{i=1}^{j-1} \tau_{li},\quad 2 \le j \le N_l.$$
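The definition of the channel holding times can be sketched in a few lines of Python. This is only an illustration of the formula above: the function name `channel_holding_times` and the list-based inputs are assumptions made for the example, not notation from the paper.

```python
def channel_holding_times(session_duration, dwell_times):
    """Return [t_l1, ..., t_lk] for one session on one route.

    session_duration: T_l, the total session length.
    dwell_times: [tau_l1, ..., tau_lN], the residual dwell time in stage 1
                 followed by the dwell times of the later stages.
    """
    holding = []
    elapsed = 0.0  # sum of dwell times of the stages already completed
    for tau in dwell_times:
        remaining = session_duration - elapsed
        if remaining <= 0:
            # The session ended before reaching this stage, so t_lj does not exist.
            break
        holding.append(min(remaining, tau))
        elapsed += tau
    return holding


# A session of length 10 on a 4-stage route ends during the 3rd stage.
print(channel_holding_times(10.0, [3.0, 4.0, 5.0, 6.0]))  # [3.0, 4.0, 3.0]
```

Note that the holding times of one session share the same $T_l$, which is why they are dependent even when the dwell times are not.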

Fig. 1 shows an example network with 3 routes. Route 1
starts from cell 1 and passes through cells 3, 4, and 6 (i.e., $c(1,1) = 1$,
$c(1,2) = 3$, $c(1,3) = 4$, and $c(1,4) = 6$). A user starts a
session in cell 1, and the session is terminated in cell 4. The
corresponding $T_1$, $\tau_{11}$, $\tau_{12}$, $\tau_{13}$, $t_{11}$, $t_{12}$, and $t_{13}$ are labeled in
the figure.

Since there is a one-to-one mapping between active users
and sessions, we do not distinguish the two. Note that given
a route $l$, we know the sessions start at cell $c(l,1)$ but we
do not know where they end. Furthermore, we do not assume
independence between $T_l$ and $\tau_{lj}$, and the channel holding
times $t_{lj}$ are not independent either.

Let $x_{lj}$, $1 \le l \le L$, $1 \le j \le N_l$, denote the number
of active users in the $j$th stage on the $l$th route; let $y_n$,
$1 \le n \le C$, denote the number of active users in the $n$th
cell. Let $\mathbf{x} = [\{x_{lj} : 1 \le l \le L, 1 \le j \le N_l\}]^T$ and
$\mathbf{y} = [y_1, y_2, \ldots, y_C]^T$. We aim to derive $\pi(\mathbf{x})$ and $\pi_1(\mathbf{y})$,
the joint stationary distributions for $\mathbf{x}$ and $\mathbf{y}$, respectively.

A partial list of nomenclature is given in Table I.

Fig. 1. System model.



TABLE I
Selected Definition of Variables

Name                  Definition
--------------------  ----------------------------------------------------------
$t_{lj}$              Random variable: on the $l$th route, channel holding time
                      at the $j$th stage.
$t_{lkj}$             Random variable: on the $l$th route, channel holding time
                      at the $j$th stage given that there are $k$ stages.
$\bar{t}_{lj}$        On the $l$th route, the average value of $t_{lj}$, given
                      that the number of stages $\ge j$.
$\vec{t}_{lk}$        Random vector: $(t_{lk1}, \ldots, t_{lkk})$.
$t_{lkij}$            Constant: on the $l$th route, $i$th realization of channel
                      holding time at the $j$th stage, when the session lasts
                      $k$ stages.
$\vec{t}_{lki}$       Constant vector: $(t_{lki1}, \ldots, t_{lkik})$.
$p_{lk}$              On the $l$th route, probability that a session lasts $k$
                      stages.
$q_{lki}$             On the $l$th route, probability of the $i$th realization
                      of a session, given that a session lasts for $k$ stages.
$p_{lki}$             On the $l$th route, probability that a session lasts $k$
                      stages and is in the $i$th realization,
                      $p_{lki} = p_{lk} q_{lki}$.
$\lambda_{l0}$        Arrival rate of the $l$th route.
$\lambda_{lj}$        $1/\bar{t}_{lj}$.
$\lambda_{lki0}$      Arrival rate of sessions lasting $k$ stages, in the $i$th
                      realization, on the $l$th route,
                      $\lambda_{lki0} = p_{lki} \lambda_{l0}$.
$\lambda_{lkij}$      $1/t_{lkij}$.
$w'_{lj}$             Invariant measure of memoryless network.
$w_{lj}$              $w'_{lj}/\lambda_{lj}$.
$w'_{lkij}$           Invariant measure of decoupled memoryless network.
$w_{lkij}$            $w'_{lkij}/\lambda_{lkij}$.
$\lambda_n$           Average arrival rate of the $n$th cell.
$\bar{t}_n$           Average channel holding time of the $n$th cell.



IV. User Distribution in Single-Route Network 

We first derive the stationary user distribution on a sin- 
gle route. We construct a reference single-route memoryless 
network, where all the channel holding times are indepen- 
dently and exponentially distributed. We prove insensitivity 
by showing an equivalence between the original network and 
the memoryless network in terms of user distribution.

(a) Single-route network.
(b) Reference single-route memoryless network.
(c) Decoupled network.
Fig. 2. Single-route network and decomposition.

A. Queueing Network Model for Single-Route Network 

Consider exclusively the $l$th route in the network. Throughout
this section, we will carry the route index $l$ in most
symbols, since they will be re-used in the analysis of multiple-route
networks.

As shown in Fig. 2(a), we model the route as a tandem-like
queueing network, except with early exits. The node
labeled 0 represents the exogenous world. The $j$th queue,
$1 \le j \le N_l$, represents the $j$th stage of the route, and units in
this queue represent sessions in the $j$th stage. Each queue has
infinite servers, since the sessions are served in parallel with
no waiting. The channel holding time of a session in the $j$th
stage, $t_{lj}$, is equivalent to the service time of the $j$th queue.
The handoff of a session from the $j$th stage to the $(j+1)$th
stage is equivalent to a unit movement from the $j$th queue to
the $(j+1)$th queue. The termination of a session is equivalent
to the movement from a queue to node 0.

Let $p_{lk}$ denote the probability that a session lasts for $k$
stages. It is given by

$$p_{lk} = P\Big[\sum_{j=1}^{k-1} \tau_{lj} < T_l \le \sum_{j=1}^{k} \tau_{lj}\Big], \quad \text{for } 2 \le k \le N_l - 1,$$

with $p_{l1} = P[T_l \le \tau_{l1}]$ and $p_{lN_l} = P\big[T_l > \sum_{j=1}^{N_l-1} \tau_{lj}\big]$.
Note that we have $\sum_{k=1}^{N_l} p_{lk} = 1$. Given a session in the $k$th
stage, it enters the $(k+1)$th stage with probability
$\frac{\sum_{j=k+1}^{N_l} p_{lj}}{\sum_{j=k}^{N_l} p_{lj}}$
and terminates with probability
$\frac{p_{lk}}{\sum_{j=k}^{N_l} p_{lj}}$.

B. Reference Single-Route Memoryless Network 

We define a reference single-route memoryless network as
a Jackson network with the same topology as the original
single-route network, where each queue has infinitely many
independent and exponential servers. An illustration is shown
in Fig. 2(b). By matching the mean service times in this
memoryless network with those of the original network, we
see that its external arrival rate is $\lambda_{l0}$, the service rate of the
$j$th queue is $\lambda_{lj} = 1/\bar{t}_{lj}$, the service rate from the $k$th queue
to the $(k+1)$th queue is
$\frac{\sum_{j=k+1}^{N_l} p_{lj}}{\sum_{j=k}^{N_l} p_{lj}} \lambda_{lk}$,
and the service rate from the $k$th queue to node 0 is
$\frac{p_{lk}}{\sum_{j=k}^{N_l} p_{lj}} \lambda_{lk}$.

Let $w'_{lj}$ be the positive invariant measure of the $j$th queue
that satisfies the routing balance equations of the single-route
memoryless network, with the convention that at node 0, $w'_0 = 1$.
It can be derived from the topology of Fig. 2(b) that

$$w'_{l1} = \lambda_{l0}, \qquad w'_{lj} = \lambda_{l0}\Big(1 - \sum_{n=1}^{j-1} p_{ln}\Big), \quad 2 \le j \le N_l. \tag{1}$$

Let $w_{lj} = w'_{lj}/\lambda_{lj}$. Then the stationary distribution of this network
is [22]

$$\pi_0(\mathbf{x}) = \prod_{j=1}^{N_l} e^{-w_{lj}} \frac{w_{lj}^{x_{lj}}}{x_{lj}!}. \tag{2}$$
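Equations (1) and (2) are simple enough to evaluate directly. The following is a minimal sketch, assuming the stage-count probabilities and the mean holding times $\bar{t}_{lj}$ are already known; the function names and argument layout are illustrative assumptions.

```python
import math


def memoryless_route_means(lam0, p, mean_holding):
    """Return [w_l1, ..., w_lN] with w_lj = w'_lj / lambda_lj.

    lam0: external arrival rate lambda_l0.
    p: [p_l1, ..., p_lN], probabilities that a session lasts k stages.
    mean_holding: [tbar_l1, ..., tbar_lN], so that lambda_lj = 1 / tbar_lj.
    """
    means = []
    reached = 1.0  # 1 - sum_{n<j} p_ln, the fraction of sessions reaching stage j
    for pk, tbar in zip(p, mean_holding):
        means.append(lam0 * reached * tbar)  # eq. (1) divided by lambda_lj
        reached -= pk
    return means


def product_form_pmf(x, means):
    """Eq. (2): probability of the occupancy vector x as a product of Poisson pmfs."""
    prob = 1.0
    for xj, wj in zip(x, means):
        prob *= math.exp(-wj) * wj ** xj / math.factorial(xj)
    return prob


w = memoryless_route_means(2.0, [0.5, 0.25, 0.25], [1.0, 2.0, 4.0])
print(w)                                # [2.0, 2.0, 2.0]
print(product_form_pmf([0, 0, 0], w))   # probability that the route is empty
```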



C. Insensitivity of Single-Route Network 

For the original single-route network, we employ a
decomposition-composition approach to derive its stationary
user distribution.

Given that one session lasts for $k$ stages, we denote the
channel holding times as a $k$-dimensional random vector $\vec{t}_{lk} =
(t_{lk1}, \ldots, t_{lkj}, \ldots, t_{lkk})$, where $t_{lkj}$ is the channel holding time
at the $j$th stage. We assume that $\vec{t}_{lk}$ is an arbitrarily distributed
discrete random vector with $M_{lk}$ possible realizations$^1$. For
any $i$, $1 \le i \le M_{lk}$, we define a $k$-dimensional deterministic
vector $\vec{t}_{lki} = [t_{lki1}, \ldots, t_{lkij}, \ldots, t_{lkik}]^T$ corresponding to the
$i$th realization of $\vec{t}_{lk}$. Let $q_{lki}$ be the probability of the $i$th
realization given that the session lasts for $k$ stages. Also, let
$p_{lki} = p_{lk} q_{lki}$ denote the probability that a session lasts for $k$
stages and is in the $i$th realization.

$^1$For a vector of continuous channel holding times, we can use a sequence of
discrete distributions with decreasing granularity to approach its distribution.

By doing so, we decompose the original network into a
multiple-branch queueing network as shown in Fig. 2(c), which
is referred to as the decoupled network. In this network, there
are $N_l$ main branches, where the $k$th main branch represents
the event that a session lasts for $k$ stages. The $k$th main
branch contains $M_{lk}$ sub-branches, where the $i$th sub-branch
represents the realization where $\vec{t}_{lk} = \vec{t}_{lki}$. Furthermore,
the $j$th queue in the $i$th sub-branch of the $k$th main branch
represents the $j$th stage of the $i$th realization of the sessions
that last for $k$ stages.

Hence, each queue of the decoupled network has infinite
servers with deterministic service time, $t_{lkij}$, for the $j$th stage
of the $i$th sub-branch of the $k$th main branch. Furthermore,
the arrival rate of the $i$th sub-branch of the $k$th main branch
is $\lambda_{lki0} = p_{lki} \lambda_{l0}$. Let $\tilde{\mathbf{x}} = [\{x_{lkij} : 1 \le k \le N_l, 1 \le j \le
k, 1 \le i \le M_{lk}\}]^T$ be the vector of the numbers of sessions in
the $j$th stage of the $i$th sub-branch of the $k$th main branch.
Denote by $\pi_D(\tilde{\mathbf{x}})$ the stationary distribution of the decoupled
network.

Note that the stationary distribution of a Jackson network
with infinite servers at each queue is insensitive with respect
to the distribution of the service times [23]. Therefore, $\pi_D(\tilde{\mathbf{x}})$
remains unchanged if we create a reference Jackson network
by replacing each queue in the decoupled network with a
queue that has exponential service time with service rate
$\lambda_{lkij} = 1/t_{lkij}$. Let $w'_{lkij}$ be the positive invariant measure of
the $j$th stage of the $i$th sub-branch of the $k$th main branch
of the memoryless version of the decoupled network, which
satisfies the routing balance equations with the convention that
at node 0, $w'_0 = 1$. Since each sub-branch is a chain network,
we have

$$w'_{lkij} = p_{lki} \lambda_{l0}. \tag{3}$$

Let $w_{lkij} = w'_{lkij}/\lambda_{lkij}$. Then the stationary distribution of the
decoupled network is

$$\pi_D(\tilde{\mathbf{x}}) = \prod_{j=1}^{N_l} \prod_{k=j}^{N_l} \prod_{i=1}^{M_{lk}} e^{-w_{lkij}} \frac{w_{lkij}^{x_{lkij}}}{x_{lkij}!}. \tag{4}$$

Next, we re-compose $\pi(\mathbf{x})$ by summing up $\pi_D(\tilde{\mathbf{x}})$ over all $\tilde{\mathbf{x}}$
satisfying $x_{lj} = \sum_{k=j}^{N_l} \sum_{i=1}^{M_{lk}} x_{lkij}$, $\forall j$. To derive $\pi(\mathbf{x})$, we first
introduce the following lemma.

Lemma 1: Consider a stationary open Jackson network with
$N$ queues each with an infinite number of servers. Let $x_j$ be
the number of units in the $j$th queue and $\mathbf{x} = [x_1, \ldots, x_N]^T$.
Suppose $\{J_1, J_2, \ldots, J_M\}$ is a set of mutually exclusive
subsets of $\{1, 2, \ldots, N\}$. Let $z_i = \sum_{j \in J_i} x_j$, $i = 1, 2, \ldots, M$,
denote the sum of units in the queues inside $J_i$. Then, the
distribution of $\mathbf{z} = [z_1, \ldots, z_M]^T$ is

$$\pi(\mathbf{z}) = \prod_{i=1}^{M} e^{-v_i} \frac{v_i^{z_i}}{z_i!}, \tag{5}$$

where $v_i = \sum_{j \in J_i} w_j$, and $w_j$ is the expected number of units
in the $j$th queue.

Fig. 3. Multiple-route network.

Proof: For a Jackson network with infinite servers at each
queue, the stationary queue lengths are independent Poisson
random variables with mean $w_j$ for the $j$th queue. Hence, $z_i$
is Poisson with mean $v_i = \sum_{j \in J_i} w_j$ for all $i$. Furthermore,
since $\{J_i\}$ are mutually exclusive, $\{z_i\}$ are independent. ■
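Lemma 1 can be checked numerically on a small instance by comparing the closed form (5) against a brute-force sum of the joint Poisson product over all occupancy vectors with the required group sums. The sketch below is such a check; the function names, the 0-based queue indices, and the example means are assumptions chosen for illustration.

```python
import math
from itertools import product


def poisson_pmf(k, mean):
    return math.exp(-mean) * mean ** k / math.factorial(k)


def grouped_pmf(z, w, groups):
    """Closed form (5): each group sum z_i is Poisson with mean v_i = sum of w_j in J_i."""
    v = [sum(w[j] for j in g) for g in groups]
    return math.prod(poisson_pmf(zi, vi) for zi, vi in zip(z, v))


def grouped_pmf_bruteforce(z, w, groups):
    """Sum the product-form joint pmf over every x whose group sums equal z.

    Queues outside all groups are marginalized out, which contributes a factor of 1.
    """
    queues = [j for g in groups for j in g]
    limit = {j: zi for zi, g in zip(z, groups) for j in g}
    total = 0.0
    for xs in product(*(range(limit[j] + 1) for j in queues)):
        x = dict(zip(queues, xs))
        if all(sum(x[j] for j in g) == zi for zi, g in zip(z, groups)):
            total += math.prod(poisson_pmf(x[j], w[j]) for j in queues)
    return total


w = [0.5, 1.0, 1.5, 2.0]        # expected units in four queues (0-based indices)
groups = [[0, 1], [2]]          # J_1 = {0, 1}, J_2 = {2}; queue 3 is not grouped
print(grouped_pmf([2, 1], w, groups))
print(grouped_pmf_bruteforce([2, 1], w, groups))
```

Both calls should agree up to floating-point error, which is the statement of the lemma for this instance.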
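Next, we note that the expected service time spent in the
$j$th stage given that the $j$th stage exists, i.e., $j \le k$ for the $k$th
main branch, can be computed as

$$\bar{t}_{lj} = \frac{\sum_{k=j}^{N_l} \sum_{i=1}^{M_{lk}} p_{lki} t_{lkij}}{\sum_{k=j}^{N_l} \sum_{i=1}^{M_{lk}} p_{lki}} = \frac{\sum_{k=j}^{N_l} \sum_{i=1}^{M_{lk}} p_{lki} t_{lkij}}{1 - \sum_{n=1}^{j-1} p_{ln}}. \tag{6}$$

Combining this with (1) and (3), we have

$$\sum_{k=j}^{N_l} \sum_{i=1}^{M_{lk}} w_{lkij} = \sum_{k=j}^{N_l} \sum_{i=1}^{M_{lk}} \frac{\lambda_{l0} p_{lki}}{\lambda_{lkij}}
= \sum_{k=j}^{N_l} \sum_{i=1}^{M_{lk}} \lambda_{l0} p_{lki} t_{lkij}
= \lambda_{l0}\Big(1 - \sum_{n=1}^{j-1} p_{ln}\Big) \bar{t}_{lj}
= \frac{\lambda_{l0}}{\lambda_{lj}}\Big(1 - \sum_{n=1}^{j-1} p_{ln}\Big)
= w_{lj}. \tag{7}$$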
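Therefore, by Lemma 1, we have

$$\pi(\mathbf{x}) = \sum_{\tilde{\mathbf{x}}:\ x_{lj} = \sum_{k=j}^{N_l} \sum_{i=1}^{M_{lk}} x_{lkij},\ \forall j} \pi_D(\tilde{\mathbf{x}}) = \prod_{j=1}^{N_l} e^{-w_{lj}} \frac{w_{lj}^{x_{lj}}}{x_{lj}!}, \tag{8}$$

which is restated as the following theorem:

Theorem 1: The single-route network has the same stationary
distribution as that of the corresponding single-route
memoryless network: $\pi(\mathbf{x}) = \pi_0(\mathbf{x})$.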
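A quick Monte Carlo experiment can illustrate the insensitivity claim on a toy route. The sketch below is not the paper's experiment: it assumes a hypothetical 3-stage route in which one random speed drives all dwell times (so the holding times are strongly dependent) and the session duration is uniform rather than exponential. It then compares the simulated mean occupancy of each stage with $\lambda_{l0}$ times the mean holding time in that stage (counting zero when the stage is not reached), which is $w_{lj}$ by (7).

```python
import random


def sample_session(rng):
    """Hypothetical dependent model: one random speed scales every dwell time."""
    speed = rng.uniform(0.5, 2.0)
    duration = rng.uniform(0.0, 6.0)                   # T_l, not exponential
    dwell = [1.0 / speed, 2.0 / speed, 1.5 / speed]    # tau_l1..tau_l3, fully correlated
    return duration, dwell


def stage_intervals(duration, dwell):
    """(start, end) offsets of each stage the session actually visits."""
    intervals, start = [], 0.0
    for tau in dwell:
        if duration <= start:
            break
        intervals.append((start, min(duration, start + tau)))
        start += tau
    return intervals


def snapshot_counts(rate, horizon, rng, n_stages=3):
    """Sessions in each stage at time `horizon`, for Poisson arrivals on [0, horizon)."""
    counts = [0] * n_stages
    t = rng.expovariate(rate)
    while t < horizon:
        duration, dwell = sample_session(rng)
        for j, (s, e) in enumerate(stage_intervals(duration, dwell)):
            if t + s <= horizon < t + e:
                counts[j] += 1
        t += rng.expovariate(rate)
    return counts


def compare(rate=2.0, horizon=8.0, runs=4000, seed=1):
    """Return (simulated mean occupancy, rate * mean holding time) per stage."""
    rng = random.Random(seed)
    sim = [0.0] * 3
    for _ in range(runs):
        for j, c in enumerate(snapshot_counts(rate, horizon, rng)):
            sim[j] += c / runs
    pred = [0.0] * 3
    for _ in range(runs):
        duration, dwell = sample_session(rng)
        for j, (s, e) in enumerate(stage_intervals(duration, dwell)):
            pred[j] += rate * (e - s) / runs
    return sim, pred


sim, pred = compare()
print([round(v, 2) for v in sim])
print([round(v, 2) for v in pred])
```

The horizon of 8 exceeds the longest possible session (6 time units), so every session present at the snapshot arrived after time 0 and the snapshot is a stationary sample. The two printed vectors should agree up to sampling noise, even though the holding times are neither exponential nor independent.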

V. User Distribution in Multiple-Route Network 

In this section, we study the general case with multiple 
routes. We first extend the results from the previous section 
to show $\pi(\mathbf{x}) = \pi_0(\mathbf{x})$ in a multiple-route network. We then
derive the user distribution $\pi_1(\mathbf{y})$ with respect to cells and
show its insensitivity. 



A. Queueing Network Model for Multiple-Route Network 

Since the $L$ routes are independent, we model the multiple-route
network as a parallel composition of $L$ single-route networks, as
shown in Fig. 3. Similar to Section IV, we consider a reference
multiple-route memoryless network, which is a parallel composition
of $L$ corresponding single-route memoryless networks. Then,
we construct the decoupled multiple-route network, which
is a parallel composition of $L$ corresponding single-route decoupled
networks.

B. Insensitivity of $\pi(\mathbf{x})$

Theorem 2: The multiple-route network has the same stationary
distribution as that of the corresponding multiple-route
memoryless network.

Proof: Since the routes are independent, the stationary
distribution of the multiple-route network can be computed
as the product of the stationary distributions of the single-route
networks:

$$\pi(\mathbf{x}) = \prod_{l=1}^{L} \prod_{j=1}^{N_l} e^{-w_{lj}} \frac{w_{lj}^{x_{lj}}}{x_{lj}!}. \tag{9}$$

Since the same holds for the multiple-route memoryless network,
we have $\pi(\mathbf{x}) = \pi_0(\mathbf{x})$. ■

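C. Insensitivity of $\pi_1(\mathbf{y})$

Let $\lambda_n$ be the average total arrival rate to cell $n$, including
both new and handoff arrivals. Let $\bar{t}_n$ be the average channel
holding time in cell $n$, considering all routes and stages. Thus,

$$\lambda_n = \sum_{l,j:\, c(l,j)=n} \sum_{k=j}^{N_l} \sum_{i=1}^{M_{lk}} \lambda_{l0} p_{lki}, \tag{10}$$

$$\bar{t}_n = \frac{\sum_{l,j:\, c(l,j)=n} \sum_{k=j}^{N_l} \sum_{i=1}^{M_{lk}} \lambda_{l0} p_{lki} t_{lkij}}{\sum_{l,j:\, c(l,j)=n} \sum_{k=j}^{N_l} \sum_{i=1}^{M_{lk}} \lambda_{l0} p_{lki}}. \tag{11}$$

Then from (7), we have

$$\lambda_n \bar{t}_n = \sum_{l,j:\, c(l,j)=n} \sum_{k=j}^{N_l} \sum_{i=1}^{M_{lk}} \lambda_{l0} p_{lki} t_{lkij}
= \sum_{l,j:\, c(l,j)=n} \sum_{k=j}^{N_l} \sum_{i=1}^{M_{lk}} w_{lkij}
= \sum_{l,j:\, c(l,j)=n} w_{lj}. \tag{12}$$

The joint user distribution among all cells can be computed
as a summation over those entries of $\pi_0(\mathbf{x})$ satisfying $y_n =
\sum_{l,j:\, c(l,j)=n} x_{lj}$. Then from Lemma 1, we obtain

$$\pi_1(\mathbf{y}) = \sum_{\mathbf{x}:\ y_n = \sum_{l,j:\, c(l,j)=n} x_{lj},\ \forall n}\ \prod_{l=1}^{L} \prod_{j=1}^{N_l} e^{-w_{lj}} \frac{w_{lj}^{x_{lj}}}{x_{lj}!}
= \prod_{n=1}^{C} e^{-\sum_{l,j:\, c(l,j)=n} w_{lj}} \frac{\big(\sum_{l,j:\, c(l,j)=n} w_{lj}\big)^{y_n}}{y_n!}
= \prod_{n=1}^{C} e^{-\lambda_n \bar{t}_n} \frac{(\lambda_n \bar{t}_n)^{y_n}}{y_n!}. \tag{13}$$

We make the following observations from (13):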

• The number of users in each cell is independent and
Poisson. This is in accordance with Theorem 9.27 in [21].

• The stationary distribution depends only on the average
arrival rates and average channel holding times in individual
cells, having the exact same form as that of an M/M/∞
open Jackson network. It is insensitive with respect to the
distribution of channel holding times, or the correlation 
among them. Furthermore, it is insensitive with respect to 
movement patterns, since the exact routing in the network 
is irrelevant. 

• The marginal distribution within a single cell depends 
only on the average arrival rate and average channel 
holding time at that cell. This useful property facilitates 
efficient system management and planning in practice, 
helping to avoid the need for collecting a large amount 
of user location data. 
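To make the practical consequence concrete, the sketch below aggregates per-stage means $w_{lj}$ into per-cell Poisson means $\lambda_n \bar{t}_n$ as in (12) and evaluates a single-cell marginal. The route layout, the numerical values, and the function names are hypothetical and serve only as an example.

```python
import math
from collections import defaultdict


def cell_means(routes):
    """Aggregate stage means into per-cell Poisson means.

    routes: list of (cells, stage_means) pairs, where cells[j] is the cell
            visited in stage j+1 of the route and stage_means[j] is w_lj.
    Returns {cell n: lambda_n * tbar_n}.
    """
    load = defaultdict(float)
    for cells, stage_means in routes:
        for n, w in zip(cells, stage_means):
            load[n] += w
    return dict(load)


def cell_pmf(y, mean):
    """Marginal probability of y active users in a cell with the given mean."""
    return math.exp(-mean) * mean ** y / math.factorial(y)


# Two hypothetical routes; the first follows cells 1, 3, 4, 6 as in the Fig. 1 example.
routes = [
    ([1, 3, 4, 6], [2.0, 1.0, 0.5, 0.25]),
    ([3, 4], [1.0, 1.5]),
]
means = cell_means(routes)
print(means)                    # {1: 2.0, 3: 2.0, 4: 2.0, 6: 0.25}
print(cell_pmf(0, means[3]))    # probability that cell 3 is empty
```

In practice the per-cell mean can be estimated directly as the measured arrival rate times the measured mean holding time at that cell, without reconstructing the routes.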

VI. Experimental Study 

In this section, our analysis is validated via experimenting 
with real-world traces. We first present the data source and 
experimental settings. We then compare the experimental and 
analytical results. 

A. Requirements and the Dartmouth Traces 

There are several publicly available traces online, including
the Dartmouth traces 123 J28) CD, the UCSD traces [29],
the IBM-Watson traces [30], and the Montreal traces [31].
To choose proper traces, we need to consider the following
requirements. First, there should be a large number of sample
points to facilitate computing the stationary distribution by
relative frequency. Note that the support of the stationary
distribution increases exponentially with the number of cells.
Most available traces do not have a large enough data set.
Second, the cells should be located close enough to each other that
there is enough handoff traffic among them. Data from cells
that already operate independently do not provide a rigorous test of
our analytical model.

To the best of our knowledge, the Dartmouth traces are the 
most recent public traces satisfying both requirements. They 
have been widely studied in the literature E3 E3 EH fBl 
In our experiment, we use data from the academic area in the
Dartmouth traces, with 152 APs and more than 5000 users,
during a 17-week period (Nov. 1, 2003 to Feb. 28, 2004)
[18]. We focus on the Simple Network Management Protocol
(SNMP) logs, which are constructed every five minutes, when 
each AP polls all the users attached to it. Each polling message 
includes the information such as the name of AP, timestamp, 
the MAC and IP addresses of users attached to it, signal 
strength, and the number of packets transmitted. By analyzing 
such data, we can derive the average arrival rate, average 
channel holding time, and the stationary distribution by relative 
frequency. 

B. Data Preprocessing 

1) Data Extraction: Since the behavior of users may 
change greatly between daytime and nighttime, or workdays 



and holidays, we focus on data accumulated from 9 am to 5 pm 
on Monday to Friday. We also discard the data accumulated 
during the periods of holiday breaks, including Thanksgiving 
(Nov. 26, 2003 to Nov. 30, 2003) and Christmas and New Year 
(Dec. 17, 2003 to Jan. 4, 2004). In addition, for some APs,
we observed periods when they were temporarily powered off. If
the total service time of an AP on a certain day is less than
1/3 of its average value, we discard the data for this day.

2) Trace Gap Padding: The session duration is defined 
as the period of time during which a user is continuously 
connected to the network. The user may move from one AP 
to another during a session. Occasionally, a user may disappear 
from the SNMP report and soon reappear. This may be caused 
by the user departing and then returning to the network, or 
by a missing SNMP report. Following the solution
proposed in [15], we set a departure length threshold $T_d = 10$
minutes. If a user disappears and reappears within $T_d$, it
is regarded as having stayed in the network, and the missing SNMP
logs are padded.
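The padding rule can be sketched as a single pass over the sorted poll times at which a user was observed. This is an illustrative sketch only: the function name, the use of seconds, and the choice to measure the gap between consecutive sightings are assumptions, since the exact bookkeeping is not spelled out here.

```python
def merge_sightings(timestamps, gap_threshold=600):
    """Merge sorted sighting times (in seconds) into sessions.

    Two consecutive sightings belong to the same session when they are at
    most `gap_threshold` seconds apart (10 minutes by default), so a single
    missed 5-minute poll is padded rather than treated as a departure.
    Returns a list of (session_start, session_end) tuples.
    """
    sessions = []
    for ts in timestamps:
        if sessions and ts - sessions[-1][1] <= gap_threshold:
            sessions[-1][1] = ts          # extend the current session
        else:
            sessions.append([ts, ts])     # start a new session
    return [tuple(s) for s in sessions]


# Polls every 300 s; the poll at 900 s is missing and gets padded,
# while the 30-minute absence starts a new session.
print(merge_sightings([0, 300, 600, 1200, 3000, 3300]))
# [(0, 1200), (3000, 3300)]
```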

3) Multiple Association and Ping-Pong Effect: We also 
observe that some users are simultaneously associated with 
multiple APs within a small time interval. Some even ping- 
pong among multiple APs. We use two methods to offset these 
effects. First, when multiple associations occur, we check the 
number of packets exchanged with the user. We deem the user
to be associated with the AP that has exchanged the largest
number of packets with the user during its multiple association
period. In addition, if a user leaves one AP and then returns 
within 5 minutes, it is regarded as having stayed in the AP. 

4) Open Users: A fraction of the users may stay in the
system during almost all working hours. These users are
regarded as closed users. Since our analytical model assumes an
open network, the closed users are excluded from our experiment.
If a user stays for at least 7.5 hours during
working hours on a valid day, it is regarded as a closed user. In
our experiment, we observe that 9.91% of all users are closed
users. An analytical model for accommodating closed users is
provided in [15], which can also be applied to our work.

C. Trace Analysis 

1) Poisson Arrivals: Analysis of the Dartmouth trace in
[15] has shown that the overall new session arrivals into the
network are well modeled by a Poisson process. In this work,
we further test the arrival process of new sessions at each
AP against the Poisson assumption. This is divided into two
steps. In the first step, we run an independence test, which
indicates whether the numbers of arrivals in different time
intervals are independent. Since it is not practical to account
for all time intervals, we test the independence of arrivals in
two consecutive hours at each AP. If the AP passes the test, we
regard the arrivals at this AP to be sufficiently independent.
Let $H_2$ denote the entropy in the number of new arrivals in two
consecutive hours and $H_1$ denote the entropy in the number
of arrivals in one hour. Let $\eta = \frac{2H_1 - H_2}{H_2}$ be the normalized
entropy gap. If $\eta < 0.15$, we regard the AP as passing the
independence test. We observe that 144 of the 152 APs pass
the independence test.
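The normalized entropy gap can be estimated from a sequence of hourly arrival counts as sketched below. Several details are assumptions made for this example: entropies are plug-in estimates from relative frequencies, $H_2$ is taken as the joint entropy of overlapping pairs of consecutive entries, and the input is assumed to contain only truly consecutive hours (in practice, pairs spanning different days would need to be dropped).

```python
import math
from collections import Counter


def entropy(samples):
    """Plug-in Shannon entropy in bits of a list of hashable observations."""
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in Counter(samples).values())


def normalized_entropy_gap(hourly_counts):
    """eta = (2 * H1 - H2) / H2 for a sequence of hourly arrival counts.

    H1: entropy of the count in one hour.
    H2: joint entropy of the counts in two consecutive hours.
    Raises ZeroDivisionError when every pair is identical (H2 = 0).
    """
    pairs = list(zip(hourly_counts[:-1], hourly_counts[1:]))
    h1 = entropy(hourly_counts)
    h2 = entropy(pairs)
    return (2 * h1 - h2) / h2


# Strictly alternating counts are highly dependent, so the gap is large.
print(normalized_entropy_gap([0, 1, 0, 1, 0, 1, 0, 1]))
# A short sequence in which every pair of values appears once gives a gap near zero.
print(normalized_entropy_gap([0, 0, 1, 1, 0]))
```

Under this sketch, an AP would pass the independence test when the returned value is below 0.15.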
TABLE II
Number of stages

Stages          1       2       3      4      ≥ 5
Observations    80448   15767   7410   3553   6107



In the second step, we run a Poisson distribution test,
which indicates whether the number of arrivals in a fixed
time interval is Poisson distributed. For each AP that passes
the independence test, we count the number of new arrivals 
in each hour and calculate its real distribution. Furthermore, 
by using the actual average arrival rate per hour, we can 
determine the corresponding theoretical Poisson distribution. 
Then, we compute the Kullback-Leibler (KL) divergence H 
between the real distribution and the theoretical distribution. 
Let 9 = -g 2 - be the normalized KL value. If 9 < 0.15, we 
regard the AP as passing the Poisson distribution test. We 
observe that 124 of the 144 APs pass the Poisson distribution 
test. 
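The core of the Poisson distribution test is the KL divergence between the empirical distribution of hourly counts and a Poisson distribution with the same mean. The sketch below computes that divergence in bits; the normalization used to obtain the thresholded value is not reproduced here, and the function name and sample data are hypothetical.

```python
import math
from collections import Counter


def kl_to_poisson(hourly_counts):
    """KL divergence (bits) from the empirical count distribution to a Poisson
    distribution whose mean equals the sample mean of the counts."""
    n = len(hourly_counts)
    mean = sum(hourly_counts) / n
    kl = 0.0
    for k, c in Counter(hourly_counts).items():
        p = c / n                                              # empirical probability
        q = math.exp(-mean) * mean ** k / math.factorial(k)   # Poisson probability
        kl += p * math.log2(p / q)
    return kl


regular = [0, 1, 2, 1, 0, 3, 1, 2]     # counts that look roughly Poisson
bursty = [0, 0, 0, 0, 0, 0, 0, 40]     # one burst, e.g., the start of a lecture
print(kl_to_poisson(regular))
print(kl_to_poisson(bursty))
```

The bursty sequence yields a much larger divergence, which is the behavior the test relies on to flag APs with occasional rushes.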

Those 124 APs are referred to as valid APs, as the new 
arrivals at these APs can be well approximated as Poisson. The 
other 28 APs are referred to as invalid APs. In our experiments, 
we study the effects of both including and excluding the non- 
Poisson new sessions. We emphasize that the Poisson test 
is for new arrivals only. Even for those APs that pass the 
Poisson test, the overall session arrival process includes both 
new arrivals and handoff arrivals and hence is non-Poisson. 

From the SNMP logs, we observe that the invalid APs tend 
to have occasional bursty arrivals. Since they are within the 
academic area, we conjecture that they correspond to large 
classrooms, which experience periodic rushes at the beginning
of lecture hours. Even though such APs do not match our
analytical model, their user distribution is likely easy to predict 
in practice. 

2) Number of Stages and Channel Holding Times: We have
collected the distribution of the number of stages in each route,
which is shown in Table II. It can be seen that there is a
large percentage of sessions staying for just one stage. To 
rigorously test the analytical user distribution, we will later 
present different cases where one-stage sessions are either 
included or excluded. 

Fig. 4 shows the real distributions of channel holding 
times in different stages. The results illustrate that none of 
them is exponentially distributed. Furthermore, we check 
the dependency of channel holding times in different stages. 
The entropies of the distributions of channel holding times at 
stages 1, 2, 3, and 4 are 4.0657, 3.4172, 3.3942, and 2.9792 
bits, respectively. The entropy of their joint distribution is 
10.2998 bits. Hence, the entropy gap is 4.0657 + 3.4172 + 
3.3942 + 2.9792 - 10.2998 = 3.5565 bits, much larger than 0. 
Since the gap would be 0 if the holding times were independent, 
this shows that the channel holding times at different stages 
are dependent. 
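
The entropy gap computation above can be sketched in Python as follows. This is an illustrative sketch only: the helper names `entropy_bits` and `entropy_gap` are hypothetical, and the holding times are assumed to have already been discretized into bins, since the binning used for the trace is not specified here.

```python
import math
from collections import Counter


def entropy_bits(samples):
    """Empirical entropy, in bits, of a sequence of hashable outcomes."""
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in Counter(samples).values())


def entropy_gap(joint_samples):
    """Sum of the marginal entropies minus the joint entropy.

    joint_samples is a list of tuples, one (binned) holding time per stage.
    """
    marginals = zip(*joint_samples)  # one sequence per stage
    return (sum(entropy_bits(list(m)) for m in marginals)
            - entropy_bits(joint_samples))


# Arithmetic check with the entropies reported in the text.
stage_entropies = [4.0657, 3.4172, 3.3942, 2.9792]
joint_entropy = 10.2998
reported_gap = sum(stage_entropies) - joint_entropy
print(round(reported_gap, 4))  # approximately 3.5565 bits

# Toy data: independent stages give a gap of 0, identical stages give 1 bit.
print(entropy_gap([(0, 0), (0, 1), (1, 0), (1, 1)]))
print(entropy_gap([(0, 0), (1, 1)]))
```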

Fig. 4. The pdf of channel holding time in different stages. 

Fig. 5. Comparison of distributions for single APs. Real distributions are in 
solid lines; analytical distributions are in dashed lines. 

3) AP Locations and Distance Constraint: APs that are far 
away are likely to have little effect on each other, regardless of 
the mobility and session patterns. Therefore, to rigorously test 
the joint distribution of several APs, we are more interested in 
selecting APs located close to each other. We set a distance 
constraint, under which every pair of selected APs is less than 
500 meters apart. In the experiments, we will test 
for cases with and without this distance constraint. 
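
One possible way to enforce the pairwise distance constraint when drawing a set of APs is sketched below in Python. The planar coordinates in meters, the function names `within_distance` and `sample_close_aps`, and the use of rejection sampling are all assumptions made for illustration; the selection method applied to the trace is not specified here.

```python
import itertools
import math
import random


def within_distance(coords, ap_ids, limit_m=500.0):
    """True if every pair of the given APs is less than limit_m apart."""
    return all(math.dist(coords[a], coords[b]) < limit_m
               for a, b in itertools.combinations(ap_ids, 2))


def sample_close_aps(coords, n, limit_m=500.0, max_tries=10000, seed=0):
    """Draw n distinct APs uniformly at random, retrying until the
    pairwise distance constraint holds; returns None if no draw succeeds."""
    rng = random.Random(seed)
    ids = sorted(coords)
    for _ in range(max_tries):
        chosen = rng.sample(ids, n)
        if within_distance(coords, chosen, limit_m):
            return chosen
    return None


# Hypothetical AP positions (x, y) in meters.
coords = {"a": (0, 0), "b": (300, 0), "c": (0, 300), "d": (2000, 2000)}
print(sample_close_aps(coords, 3))  # some ordering of a, b, c
```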



D. Marginal User Distribution at a Single AP 

We first show the marginal user distribution at individual 
APs. For this test, we use all data after the pre-processing 
described in Section VI-B, without further exclusions. We 
show a sampling of the 152 APs. In order to avoid selection 
bias, we choose APs according to their numeric identity. For 
each building, we select the AP named AP1 if it exists; 
otherwise, we select AP2. In total, 32 APs are selected. 

Fig. 5 shows a comparison between the real distributions 
and the analytical distributions of these APs. Each subplot is 
labeled with Y or N, where Y indicates that the AP passes 
the two-step Poisson test and N indicates the opposite. The 
figure illustrates that the real distributions and the analytical 
distributions agree well with each other for those APs that 
pass the Poisson test. 

Fig. 7. $H_{gap}$ and $H_{real}$ under the influence of non-Poisson arrivals. 

E. KL Divergence and Entropy Gap for Multiple APs 

To compare the real and analytical joint distributions of 
multiple APs, we compute the Kullback-Leibler (KL) divergence 
$H_{kl}$ between them. We also test the independence of the 
user distributions in different cells by computing the entropy 
gap $H_{gap}$ between the sum of the entropies of the real marginal 
distributions and the entropy of the real joint distribution. The 
entropy of the real joint distribution, $H_{real}$, is also presented 
for reference. 

Given n, the number of APs we aim to study, we randomly 
choose n different APs. Then we compute $H_{kl}$, $H_{gap}$, and 
$H_{real}$ with respect to these APs. By running this procedure 
100 times, we obtain the sample mean and sample standard 
deviation of $H_{kl}$, $H_{gap}$, and $H_{real}$. In subsequent studies, we 
plot the sample mean versus n, along with bars showing one 
sample standard deviation, in Figs. 6-9. Note that the plot points 
are slightly shifted to avoid overlaps. 
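
The repeated sampling procedure above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the code used in the experiments: all function names and the toy occupancy data are hypothetical, the analytical joint distribution is taken to be a product of Poisson marginals (as for independent M/M/∞ queues), and the per-AP analytical means are supplied by the caller rather than derived from arrival rates and holding times.

```python
import math
import random
import statistics
from collections import Counter


def entropy_bits(samples):
    """Empirical entropy, in bits, of a sequence of hashable outcomes."""
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in Counter(samples).values())


def log2_poisson(k, mean):
    """log2 of the Poisson(mean) probability of k; requires mean > 0."""
    return (k * math.log(mean) - mean - math.lgamma(k + 1)) / math.log(2)


def joint_metrics(snapshots, means):
    """Return (H_kl, H_gap, H_real) in bits.

    snapshots: list of tuples of user counts, one entry per chosen AP.
    means: assumed analytical mean number of users at each chosen AP.
    """
    n = len(snapshots)
    h_real = entropy_bits(snapshots)
    h_kl = 0.0
    for state, count in Counter(snapshots).items():
        p = count / n
        log2_q = sum(log2_poisson(k, m) for k, m in zip(state, means))
        h_kl += p * (math.log2(p) - log2_q)
    h_gap = sum(entropy_bits(col) for col in zip(*snapshots)) - h_real
    return h_kl, h_gap, h_real


def repeated_study(occupancy, means, n_aps, runs=100, seed=0):
    """Sample mean and standard deviation of each metric over random AP sets.

    occupancy: dict mapping AP id to user counts at common sampling instants.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        chosen = rng.sample(sorted(occupancy), n_aps)
        snapshots = list(zip(*(occupancy[a] for a in chosen)))
        results.append(joint_metrics(snapshots, [means[a] for a in chosen]))
    return [(statistics.mean(col), statistics.stdev(col))
            for col in zip(*results)]


# Toy data: three APs observed at four instants.
occupancy = {"ap1": [0, 1, 2, 1], "ap2": [1, 1, 0, 2], "ap3": [2, 0, 1, 1]}
means = {ap: sum(v) / len(v) for ap, v in occupancy.items()}
for name, (avg, std) in zip(("H_kl", "H_gap", "H_real"),
                            repeated_study(occupancy, means, n_aps=2, runs=5)):
    print(name, avg, std)
```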

1) Influence of Non-Poisson Arrivals: Clearly, excluding 
non-Poisson arrivals could improve the accuracy of the 
analytical model. We compare $H_{kl}$, $H_{gap}$, and $H_{real}$ under 
the conditions of either including or excluding non-Poisson 
arrivals. 

A direct method to exclude non-Poisson session arrivals is 
to remove from the data set all sessions that are initiated at 
invalid APs. However, this will reduce the number of handoff 
session arrivals even in valid APs, hence biasing the analysis. 
An alternate approach is to simply remove the invalid APs 
from the data set, while allowing those non-Poisson sessions to 
be counted in the valid APs that they pass through. In this way, 
accurate average arrival rates at the valid APs are maintained. 

Fig. 8. $H_{kl}$, $H_{gap}$, and $H_{real}$ under the influence of distance restriction. 

Thus, we study the following three cases: 1) excluding 
sessions initiated at invalid APs (i.e., invalid sessions); 2) 
excluding invalid APs; and 3) no exclusion. Fig. 6 
illustrates $H_{kl}$ compared with $H_{real}$ for the three cases, and 
Fig. 7 illustrates $H_{gap}$ compared with $H_{real}$ for the three 
cases. We observe that both $H_{kl}$ and $H_{gap}$ are much smaller 
than $H_{real}$ when we either exclude invalid sessions or exclude 
invalid APs, illustrating that the real distributions are close to 
the analytical distributions, and the real marginal distributions 
of single APs are approximately independent. When we do not 
exclude invalid sessions or invalid APs, $H_{kl}$ and $H_{gap}$ become 
larger, showing that the accuracy of the analytical distribution 
is affected by the non-Poisson arrivals. However, since there is 
only a small fraction of non-Poisson arrival sessions, $H_{kl}$ and 
$H_{gap}$ remain much smaller than $H_{real}$. 

In addition, excluding invalid sessions brings only small 
decreases in $H_{kl}$ and $H_{gap}$ compared with excluding invalid 
APs. Note that when we exclude invalid sessions, both the 
one-stage and multiple-stage non-Poisson arrival sessions are 
excluded; when we exclude invalid APs, only the one-stage 
non-Poisson arrival sessions are excluded. This illustrates that 
multiple-stage non-Poisson arrival sessions have only a weak 
influence on the modeling accuracy. 

2) Influence of Distance Constraint: Fig. 8 shows $H_{kl}$, 
$H_{gap}$, and $H_{real}$ with and without the distance constraint. 
For both cases, we exclude the invalid APs. We observe that 
$H_{kl}$, $H_{gap}$, and $H_{real}$ are nearly unchanged with or without 
the distance constraint, confirming our expectation that the 
distance constraint does not influence the accuracy of the 
analytical model, since the analytical model predicts that even 
nearby APs have independent user distributions. 

3) Influence of One-Stage Sessions: Fig. 9 shows $H_{kl}$, 
$H_{gap}$, and $H_{real}$ with and without the one-stage sessions. For 
both cases, we exclude the invalid APs. We observe that when 
we exclude the one-stage sessions, $H_{kl}$ and $H_{gap}$ become 
smaller, suggesting that our model is even more accurate in this 
case. This is an apparently counter-intuitive result, since the 
analytical distribution trivially holds for one-stage sessions. An 
explanation for this is the following. Since one-stage sessions 
are more likely to be new sessions corresponding to attending 
lectures in a classroom, they are more likely to be non-Poisson. 
Since not all non-Poisson arrivals can be excluded 
by removing the invalid APs, when we further exclude one-stage 
sessions, we obtain more accurate analytical results. 

Fig. 9. $H_{gap}$ and $H_{real}$ under the influence of one-stage sessions. 

Note that one-stage sessions can be analyzed as a single-queue 
model [14]. Thus, in practice, one may separately 
analyze one-stage and multiple-stage sessions and combine 
the resultant user distributions. 

VII. Conclusions 

In this paper, we have studied the user distribution in 
multicell networks by establishing a precise analytical model 
that accommodates arbitrary user movement and arbitrarily 
distributed, mutually dependent channel holding times. We have derived 
the stationary distribution of the number of users in each cell, 
which depends only on the average arrival rate and the average 
channel holding time of each cell, and hence is insensitive 
to the general movement and session patterns. 
We have used the Dartmouth trace to validate our analysis, which 
shows that the analytical model is accurate when new session 
arrivals are Poisson and remains useful when non-Poisson 
session arrivals are also included in the data set. 